ABSTRACT

A plausible $s$-factor solution for many types of psychological and
educational tests is one in which there is one general factor and $s - 1$
group or method related factors. The bi-factor solution results from the
constraint that each item has a non-zero loading on the primary dimension
$\alpha_{j1}$ and at most one of the $s - 1$ group factors. This structure has
been termed the "bi-factor" solution by Holzinger & Swineford, but it also
appears in the work of Tucker and Joreskog. All attempts at estimating the
parameters of this model have been restricted to continuously measured
variables; it has not been previously considered in the context of
item-response theory (IRT). It is conceivable, however, that the bi-factor
structure might arise in IRT related problems.

The purpose of this paper is to derive a bi-factor item-response model for
binary response data, and to develop a corresponding method of parameter
estimation. This restriction leads to a major simplification of the likelihood
equations that (1) permits the statistical evaluation of problems of unlimited
dimensionality, (2) permits conditional dependence among discrete and
previously identified subsets of items, and (3) in some cases provides more
parsimonious factor solutions than an unrestricted full-information item
factor analysis might provide (e.g., Bock and Aitkin, 1981).
1 Introduction

Consider the case in which, for $n$ variables, an $s$-factor solution exists
in which there is one general factor and $s - 1$ group or method related
factors. The bi-factor solution constrains each item to have a non-zero
loading on the primary dimension $\alpha_{j1}$, and on not more than one of
the $s - 1$ group factors (i.e., $\alpha_{jh}$, $h = 2, \ldots, s$). For four
items, the factor-pattern matrix might be

$$
\boldsymbol{\alpha} =
\begin{bmatrix}
\alpha_{11} & \alpha_{12} & 0 \\
\alpha_{21} & \alpha_{22} & 0 \\
\alpha_{31} & 0 & \alpha_{33} \\
\alpha_{41} & 0 & \alpha_{43}
\end{bmatrix}.
$$
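The constraint on the pattern matrix can be checked mechanically. The following Python sketch is only an illustration: the function names and the numerical loadings are hypothetical, and the correlation formula used is the sum of products of loadings, $\rho_{ij} = \sum_h \alpha_{ih}\alpha_{jh}$.

```python
def is_bifactor(pattern, tol=0.0):
    """Return True if every row has a non-zero first (primary) loading and
    at most one non-zero loading among the remaining group-factor columns."""
    for row in pattern:
        if abs(row[0]) <= tol:
            return False
        group_nonzero = sum(1 for a in row[1:] if abs(a) > tol)
        if group_nonzero > 1:
            return False
    return True


def implied_correlation(pattern, i, j):
    """Correlation implied between latent responses i and j."""
    return sum(a * b for a, b in zip(pattern[i], pattern[j]))


# Hypothetical loadings for four items, two items per group factor.
alpha = [
    [0.6, 0.4, 0.0],
    [0.5, 0.3, 0.0],
    [0.8, 0.0, 0.5],
    [0.4, 0.0, 0.6],
]

print(is_bifactor(alpha))
print(implied_correlation(alpha, 0, 1))  # same group: primary + group part
print(implied_correlation(alpha, 0, 2))  # different groups: primary part only
```

Items in different groups are correlated only through the primary dimension, which is what makes the structure attractive for paragraph-based tests.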

This structure has been termed the "bi-factor" solution by Holzinger &
Swineford (1937), inter-battery factor analysis by Tucker (1958), and is also
one of the confirmatory factor analysis models considered by Joreskog (1969).
In these applications, the model is restricted to test scores, assumed to be
continuously distributed. It is easy, however, to conceive of situations where
the bi-factor pattern might arise at the item level. It is plausible for
paragraph comprehension tests, for example, in which case the primary
dimension describes the targeted aptitude and the additional factors describe
knowledge of the content area within the paragraphs. In this context, items
would be conditionally independent between paragraphs, but conditionally
dependent within specific paragraphs.

The purpose of this paper is to derive an item-response model for binary
response data that exhibit the bi-factor structure and to develop a
corresponding method of parameter estimation. Of course, other types of tests
that consist of items tapping different content areas would also be suitable
for this type of analysis. As we will show, this restriction leads to a major
simplification of the likelihood equations that (1) permits the statistical
evaluation of problems of unlimited dimensionality, (2) permits conditional
dependence among discrete and previously identified subsets of items, and
(3) in some cases provides more parsimonious factor solutions than an
unrestricted full-information item factor analysis might provide (e.g., Bock
and Aitkin, 1981). In the following sections, we derive the likelihood and its
first derivatives so that an EM solution to item bi-factor analysis may be
obtained.



2 Likelihood Evaluation 

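Stuart (1958) showed that if $n$ variables follow a standardized multivariate
normal distribution where the correlation
$\rho_{ij} = \sum_{h=1}^{s} \alpha_{ih}\alpha_{jh}$ and $\alpha_{jh}$ is
nonzero for only one $h$, then the probability that the respective variables
are simultaneously less than $\gamma_j$ is,

$$
P = \prod_{h=1}^{s} \int_{-\infty}^{\infty}
\left\{ \prod_{j=1}^{n_h}
F\!\left( \frac{\gamma_j - \alpha_{jh}\, y}{\sqrt{1 - \alpha_{jh}^2}} \right)
\right\} f(y)\, dy, \tag{1}
$$

where

$$
f(t) = \exp(-t^2/2) / (2\pi)^{1/2}, \qquad
F(t) = \int_{-\infty}^{t} f(u)\, du,
$$

and $n_h$ is the number of items loading on dimension $h$
($h = 1, \ldots, s$).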

Equation (1) follows from the fact that if each variate is related to only a
single dimension, then the $s$ dimensions are independent, and the joint
probability is simply the product of the $s$ unidimensional probabilities. In
the present context, this result only applies to the $s - 1$ "nuisance"
dimensions (i.e., $h = 2, \ldots, s$); if a primary dimension exists, it will
not be independent of the other $s - 1$ dimensions. To compute this
probability therefore requires a two-dimensional generalization of Stuart's
(1958) original result.
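The unidimensional building block can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a plain midpoint rule rather than any particular quadrature from the paper, the function names are hypothetical, and the comparison value is the standard closed form for two standard normal variables with correlation $\rho$, $P(Y_1 < 0, Y_2 < 0) = 1/4 + \arcsin(\rho)/(2\pi)$.

```python
import math


def norm_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def norm_cdf(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def one_factor_prob(gammas, alphas, half_width=8.0, steps=4000):
    """Midpoint-rule approximation of the one-dimensional integral
    int prod_j F((gamma_j - alpha_j * y) / sqrt(1 - alpha_j^2)) f(y) dy."""
    dy = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        y = -half_width + (k + 0.5) * dy
        term = norm_pdf(y)
        for g, a in zip(gammas, alphas):
            term *= norm_cdf((g - a * y) / math.sqrt(1.0 - a * a))
        total += term * dy
    return total


def product_over_dimensions(blocks):
    """Product of unidimensional probabilities, one block per dimension."""
    p = 1.0
    for gammas, alphas in blocks:
        p *= one_factor_prob(gammas, alphas)
    return p


a = 0.7  # both variables load 0.7 on one factor, so rho = 0.49
approx = one_factor_prob([0.0, 0.0], [a, a])
exact = 0.25 + math.asin(a * a) / (2.0 * math.pi)
print(approx, exact)

# A second, independent dimension with a single variable contributes 1/2.
both = product_over_dimensions([([0.0, 0.0], [a, a]), ([0.0], [0.5])])
print(both, exact * 0.5)
```

The agreement holds only because each variable loads on a single dimension; once a shared primary dimension is added, the product form no longer applies.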

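To derive the two-dimensional result, we begin by noting that the probability
of the primary dimension can be obtained using the formula of Dunnett and
Sobel (1955),

$$
P = \int_{-\infty}^{\infty}
\left\{ \prod_{j=1}^{n}
F\!\left( \frac{\gamma_j - \alpha_j\, y}{\sqrt{1 - \alpha_j^2}} \right)
\right\} f(y)\, dy, \tag{2}
$$

which is valid as long as $\rho_{ij} = \alpha_i \alpha_j$. Of course, this
directly implies a unidimensional problem. Combining the two results yields,

$$
P = \int_{-\infty}^{\infty}
\left\{ \prod_{h=2}^{s} \int_{-\infty}^{\infty}
\left[ \prod_{j=1}^{n_h}
F\!\left( \frac{\gamma_j - \alpha_{j1}\, z - \alpha_{jh}\, y}
{\sqrt{1 - \alpha_{j1}^2 - \alpha_{jh}^2}} \right)
\right] f(y)\, dy \right\} f(z)\, dz, \tag{3}
$$

which can be approximated to any practical degree of accuracy using
Gauss-Hermite quadrature (Stroud and Secrest, 1966). What is important about
this result is that, if the assumptions are reasonable (as they clearly are
for many IRT applications), then the probability of any response pattern can
be obtained by a two-dimensional integration, regardless of the
dimensionality $s$.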
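For example, if $y_j = \sum_{h=1}^{s} \alpha_{jh}\theta_h + \epsilon_j$ and we
assume that

$$
y_j \sim N(0, 1), \qquad
\boldsymbol{\theta} \sim N(\mathbf{0}, \mathbf{I}), \qquad \text{and} \qquad
\epsilon_j \sim N\!\left(0,\; 1 - \sum_{h=1}^{s} \alpha_{jh}^2\right),
$$

then the unconditional probability of observing score pattern
$\mathbf{x} = \mathbf{x}_\ell$ is,

$$
P_\ell = \int_{-\infty}^{\infty}
\left\{ \prod_{h=2}^{s} \int_{-\infty}^{\infty}
\left[ \prod_{j=1}^{n_h}
F_j(\theta_1, \theta_h)^{x_{\ell j}}
\left(1 - F_j(\theta_1, \theta_h)\right)^{1 - x_{\ell j}}
\right] f(\theta_h)\, d\theta_h \right\} f(\theta_1)\, d\theta_1, \tag{4}
$$

which can be approximated by,

$$
P_\ell \approx \sum_{q_1}
\left\{ \prod_{h=2}^{s} \sum_{q_h}
\left[ \prod_{j=1}^{n_h}
F_j(X_{q_1}, X_{q_h})^{x_{\ell j}}
\left(1 - F_j(X_{q_1}, X_{q_h})\right)^{1 - x_{\ell j}}
\right] A(X_{q_h}) \right\} A(X_{q_1}), \tag{5}
$$

where

$$
F_j(X_{q_1}, X_{q_h}) =
F\!\left( - \frac{\gamma_j - \alpha_{j1} X_{q_1} - \alpha_{jh} X_{q_h}}
{\sqrt{1 - \alpha_{j1}^2 - \alpha_{jh}^2}} \right),
$$

and $X_q$ and $A(X_q)$ are the nodes and corresponding weights of a
Gauss-Hermite quadrature.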
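As a rough illustration of the two-dimensional quadrature, the following Python sketch evaluates the probability of a response pattern for a toy bi-factor test. The item values, the helper names (`gauss_hermite_5`, `item_prob`, `pattern_prob`), the choice of a five-point rule, and the sign convention in `item_prob` (a correct response when the latent response exceeds the threshold) are assumptions made for this example.

```python
import math
from itertools import product


def norm_cdf(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def gauss_hermite_5():
    """Five-point Gauss-Hermite rule for a standard normal weight function.
    Nodes are the roots of He_5(x) = x^5 - 10 x^3 + 15 x; weights sum to 1."""
    r10 = math.sqrt(10.0)
    inner, outer = math.sqrt(5.0 - r10), math.sqrt(5.0 + r10)
    w_in, w_out = (7.0 + 2.0 * r10) / 60.0, (7.0 - 2.0 * r10) / 60.0
    nodes = [-outer, -inner, 0.0, inner, outer]
    weights = [w_out, w_in, 8.0 / 15.0, w_in, w_out]
    return nodes, weights


def item_prob(gamma, a1, ah, x1, xh):
    """Probability of a correct response at the node pair (x1, xh)."""
    scale = math.sqrt(1.0 - a1 * a1 - ah * ah)
    return norm_cdf(-(gamma - a1 * x1 - ah * xh) / scale)


def pattern_prob(x, items, groups):
    """items: list of (gamma, a1, ah); groups: lists of item indices,
    one list per group factor. Returns the quadrature approximation."""
    nodes, weights = gauss_hermite_5()
    total = 0.0
    for x1, w1 in zip(nodes, weights):          # primary dimension
        inner = 1.0
        for members in groups:                  # product over group factors
            s = 0.0
            for xh, wh in zip(nodes, weights):  # one group dimension
                lik = 1.0
                for j in members:
                    g, a1, ah = items[j]
                    p = item_prob(g, a1, ah, x1, xh)
                    lik *= p if x[j] == 1 else 1.0 - p
                s += lik * wh
            inner *= s
        total += inner * w1
    return total


# Hypothetical parameters: four items, two per group factor.
items = [(-0.2, 0.6, 0.4), (0.1, 0.5, 0.3), (0.3, 0.7, 0.5), (-0.4, 0.4, 0.6)]
groups = [[0, 1], [2, 3]]
print(pattern_prob((1, 1, 0, 0), items, groups))
print(sum(pattern_prob(x, items, groups) for x in product((0, 1), repeat=4)))
```

However many group factors are added, the work stays a set of nested sums over two node indices; a finer rule (for example from a library routine) would replace `gauss_hermite_5` without changing the rest.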

3 Marginal Maximum Likelihood Estimation 

The parameters of the item bi-factor analysis model can be estimated by the
method of marginal maximum likelihood using a variation of the approach
described by Bock & Aitkin (1981). The parameters of this model include $n$
"thresholds" or "intercepts", $n$ primary factor loadings or "slopes", and a
total of $n$ factor loadings or slopes on the $h = 2, \ldots, s$ additional
dimensions (i.e., $\sum_{h=2}^{s} n_h = n$). The likelihood equations are
derived as follows. Let
and

$$
\bar{N}_h(X) = \sum_{\ell=1}^{S} r_\ell\, [E_{\ell h}(X_{q_1})]\,
L_{\ell h}(X_{q_1}, X_{q_h}) / P_\ell. \tag{13}
$$

It should be noted that these equations are similar to those in the
unrestricted case, except that in the bi-factor case, the conditional
probability of response pattern $x_{\ell h}$ (i.e., responses to items
$j = 1, \ldots, n_h$ in subsection $h$ for response pattern $\ell$) is
weighted by the factor $E_{\ell h}(X_{q_1})$. Furthermore, since each item
only appears in one subsection ($h$), the $\bar{N}$ now vary with $h$, in
contrast to the unrestricted case. As such, the $\bar{N}_h$ denote the
effective sample size for subset $h$ at quadrature point
$(X_{q_1}, X_{q_h})$. When weighted by $A(X)$ and summed over the quadrature
nodes for each subsection, $\bar{N}_h$ yields the total number of
respondents, whereas the corresponding weighting and summation for
$\bar{r}_j$ yields the total number of respondents answering item $j$
correctly.

From provisional parameter values, each E-step yields $\bar{r}_j$ and
$\bar{N}_h$, the expectations of the complete data statistics computed
conditional on the incomplete data (see Bock, Gibbons, & Muraki, 1988). The
subsequent M-step solves equation (10) using conventional maximum likelihood
multiple probit analysis, substituting the provisional expectations of
$\bar{r}_j$ and $\bar{N}_h$ (see Bock & Jones, 1968).
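The E-step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the estimation program used here: the five-point rule, the toy data, the parameter values, and all function names are hypothetical, and the weighting factor is taken to be the product, over the other subsections, of their likelihoods summed over their own group nodes.

```python
import math


def norm_cdf(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def gauss_hermite_5():
    """Five-point Gauss-Hermite rule for a standard normal weight function."""
    r10 = math.sqrt(10.0)
    inner, outer = math.sqrt(5.0 - r10), math.sqrt(5.0 + r10)
    w_in, w_out = (7.0 + 2.0 * r10) / 60.0, (7.0 - 2.0 * r10) / 60.0
    return ([-outer, -inner, 0.0, inner, outer],
            [w_out, w_in, 8.0 / 15.0, w_in, w_out])


def e_step(patterns, counts, items, groups):
    """Return expected counts r_bar[j][q1][qh] and n_bar[h][q1][qh]."""
    nodes, weights = gauss_hermite_5()
    Q, H = len(nodes), len(groups)
    r_bar = [[[0.0] * Q for _ in range(Q)] for _ in items]
    n_bar = [[[0.0] * Q for _ in range(Q)] for _ in groups]
    for x, r in zip(patterns, counts):
        # L[h][q1][qh]: conditional likelihood of the responses in group h.
        L = []
        for members in groups:
            grid = [[1.0] * Q for _ in range(Q)]
            for q1, x1 in enumerate(nodes):
                for qh, xh in enumerate(nodes):
                    for j in members:
                        g, a1, ah = items[j]
                        z = -(g - a1 * x1 - ah * xh)
                        p = norm_cdf(z / math.sqrt(1.0 - a1 * a1 - ah * ah))
                        grid[q1][qh] *= p if x[j] == 1 else 1.0 - p
            L.append(grid)
        # G[h][q1]: group likelihood summed over the group-factor nodes.
        G = [[sum(L[h][q1][qh] * weights[qh] for qh in range(Q))
              for q1 in range(Q)] for h in range(H)]
        P = sum(weights[q1] * math.prod(G[h][q1] for h in range(H))
                for q1 in range(Q))
        for h, members in enumerate(groups):
            for q1 in range(Q):
                # Weighting factor: contribution of all other groups at q1.
                E = math.prod(G[k][q1] for k in range(H) if k != h)
                for qh in range(Q):
                    post = r * E * L[h][q1][qh] / P
                    n_bar[h][q1][qh] += post
                    for j in members:
                        if x[j] == 1:
                            r_bar[j][q1][qh] += post
    return r_bar, n_bar, weights


def weighted_total(table, weights):
    """Sum a [q1][qh] table after weighting by both sets of node weights."""
    return sum(weights[a] * weights[b] * table[a][b]
               for a in range(len(weights)) for b in range(len(weights)))


# Hypothetical data: four items in two subsections, 25 respondents.
items = [(-0.2, 0.6, 0.4), (0.1, 0.5, 0.3), (0.3, 0.7, 0.5), (-0.4, 0.4, 0.6)]
groups = [[0, 1], [2, 3]]
patterns = [(1, 1, 0, 0), (1, 0, 1, 1), (0, 0, 0, 0), (1, 1, 1, 1)]
counts = [10, 5, 7, 3]
r_bar, n_bar, weights = e_step(patterns, counts, items, groups)
print(weighted_total(n_bar[0], weights))  # total number of respondents
print(weighted_total(r_bar[0], weights))  # respondents answering item 1 correctly
```

The two printed totals reproduce the property described above: the weighted effective sample sizes sum to the number of respondents, and the weighted expected correct counts sum to the number answering the item correctly.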

4 Illustration 

To illustrate the application of the bi-factor IRT model, we have evaluated 20
items selected from an ACT natural science test, for a random sample of 1000
examinees (we are indebted to Terry Ackerman and Mark Reckase for these
data). This test involves a series of questions regarding each of four
paragraphs. For the purpose of this illustration, we selected the first 5
items from each of four paragraphs.

Table 1 displays the unrestricted promax-rotated 4-factor solution, which
adequately fit these data (improvement in fit of a four-factor model over a
three-factor model was $\chi^2_{17} = 31.59$, $p < .02$; the improvement in
fit of five factors over four factors was not significant,
$\chi^2_{16} = 18.44$, $p < .30$). Inspection of Table 1 reveals that each
factor is dominated by items from a particular paragraph. In contrast, the
estimated factor loadings for the bi-factor model (see Table 2) with $s = 5$
(i.e., one primary dimension and four paragraph-specific dimensions) revealed
a strong general ability dimension, as well as appreciable within-paragraph
associations. The fit of the restricted model was not significantly different
from the fit of either the four-factor ($\chi^2 = 23.83$, $p < .99$) or the
five-factor ($\chi^2 = 43.22$, $p < .95$) unrestricted models. Inspection
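$$
P_\ell = \int_{-\infty}^{\infty}
\left\{ \prod_{h=2}^{s} \int_{-\infty}^{\infty}
L_{\ell h}(\theta_1, \theta_h)\, f(\theta_h)\, d\theta_h \right\}
f(\theta_1)\, d\theta_1, \tag{6}
$$

where $L_{\ell h}(\theta_1, \theta_h)$ is the conditional probability of the
responses to the $n_h$ items in subsection $h$ for response pattern $\ell$.
Then the log likelihood is,

$$
\log L = \sum_{\ell=1}^{S} r_\ell \log P_\ell, \tag{7}
$$

where $S$ denotes the number of unique response patterns and $r_\ell$ the
number of respondents exhibiting pattern $\ell$. The derivative of the log
marginal likelihood with respect to a general item parameter $v_j$ is as
follows. Let

$$
E_{\ell h}(\theta_1) = \prod_{\substack{h'=2 \\ h' \neq h}}^{s}
\int_{-\infty}^{\infty} L_{\ell h'}(\theta_1, \theta_{h'})\,
f(\theta_{h'})\, d\theta_{h'}. \tag{8}
$$

Then

$$
\frac{\partial \log L}{\partial v_j}
= \sum_{\ell=1}^{S} \frac{r_\ell}{P_\ell}
\frac{\partial P_\ell}{\partial v_j} \tag{9}
$$

$$
= \sum_{\ell=1}^{S} \frac{r_\ell}{P_\ell}
\int_{-\infty}^{\infty} E_{\ell h}(\theta_1)
\left\{ \int_{-\infty}^{\infty}
\frac{x_{\ell j} - F_j(\theta)}{F_j(\theta)\,[1 - F_j(\theta)]}\,
L_{\ell h}(\theta)\,
\frac{\partial F_j(\theta)}{\partial v_j}\,
f(\theta_h)\, d\theta_h \right\} f(\theta_1)\, d\theta_1. \tag{10}
$$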

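Following Bock and Aitkin (1981), the marginal likelihood equations can be
solved, using the EM algorithm of Dempster, Laird & Rubin (1977), by
replacing the integrals with Gauss-Hermite quadratures and rearranging terms
into the two-dimensional form:

$$
\frac{\partial \log L}{\partial v_j} \approx
\sum_{q_1} \sum_{q_h}
\frac{\bar{r}_j(X) - \bar{N}_h(X)\, F_j(X)}{F_j(X)\,[1 - F_j(X)]}
\left( \frac{\partial F_j(X)}{\partial v_j} \right)
A(X_{q_h})\, A(X_{q_1}), \tag{11}
$$

where $X = (X_{q_1}, X_{q_h})$,

$$
\bar{r}_j(X) = \sum_{\ell=1}^{S} r_\ell\, x_{\ell j}\,
[E_{\ell h}(X_{q_1})]\, L_{\ell h}(X_{q_1}, X_{q_h}) / P_\ell \tag{12}
$$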

of the loadings within each paragraph reveals that the intra-paragraph item
associations are quite variable.

As a computational note, we should point out that the numerical precision of
the bi-factor solution represents a major improvement over the unrestricted
solution. Given that the bi-factor solution only requires approximation of a
two-dimensional integral, we were able to use 100 quadrature points (i.e., 10
in each dimension) instead of the 243 quadrature points used in the
unrestricted five-factor solution (i.e., 3 in each dimension). Five factors
probably represents the highest dimensional solution that is computationally
tractable at this time. Parameters of the unrestricted models were estimated
using the TESTFACT program (Wilson, Wood & Gibbons, 1984).
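The arithmetic behind these counts is easy to reproduce. The short Python sketch below uses hypothetical function names and simply counts grid points: a full product grid grows as a power of the number of factors, whereas the bi-factor grid stays two-dimensional.

```python
def full_grid_points(points_per_dim, n_factors):
    """Nodes in a full product grid for an unrestricted n_factors solution."""
    return points_per_dim ** n_factors


def bifactor_grid_points(points_per_dim):
    """Nodes in the two-dimensional grid used for each subsection,
    whatever the total number of factors."""
    return points_per_dim ** 2


print(full_grid_points(3, 5))     # unrestricted five-factor, 3 per dimension
print(bifactor_grid_points(10))   # bi-factor, 10 per dimension
print(full_grid_points(10, 5))    # unrestricted five-factor at 10 per dimension
```

Matching the bi-factor resolution of 10 points per dimension in an unrestricted five-factor analysis would require 100,000 grid points, which is why only 3 points per dimension were practical there.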



5 A Simple Structure Model 

Consider an orthogonal simple structure factor model in which each item loads
on one and only one of $s$ dimensions. This satisfies a complete simple
structure model as defined by Thurstone (1947), which for measurement data
could be evaluated using methods for confirmatory factor analysis (Joreskog,
1969). This is, of course, a simplification of the bi-factor model in which
there is no primary dimension. In this case, the unconditional probability in
(5) is reduced to the unidimensional form,



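$$
P_\ell \approx \prod_{h=1}^{s} \sum_{q}
\left[ \prod_{j=1}^{n_h}
F_j(X_q)^{x_{\ell j}} \left(1 - F_j(X_q)\right)^{1 - x_{\ell j}}
\right] A(X_q), \tag{14}
$$

where

$$
F_j(X_q) = F\!\left( - \frac{\gamma_j - \alpha_{jh} X_q}
{\sqrt{1 - \alpha_{jh}^2}} \right);
$$

that is, (5) reduces to the product of the $s$ independent unidimensional
probabilities. The likelihood equations in (11) can then be approximated by,

$$
\frac{\partial \log L}{\partial v_j} \approx
\sum_{q}
\frac{\bar{r}_j(X_q) - \bar{N}_h(X_q)\, F_j(X_q)}{F_j(X_q)\,[1 - F_j(X_q)]}
\left( \frac{\partial F_j(X_q)}{\partial v_j} \right) A(X_q), \tag{15}
$$

where

$$
\bar{r}_j(X_q) = \sum_{\ell=1}^{S} r_\ell\, x_{\ell j}\,
[e_{\ell h}]\, L_{\ell h}(X_q) / P_\ell \tag{16}
$$

and

$$
\bar{N}_h(X_q) = \sum_{\ell=1}^{S} r_\ell\,
[e_{\ell h}]\, L_{\ell h}(X_q) / P_\ell.
$$

In this case, $e_{\ell h}$ represents the constant

$$
e_{\ell h} = \prod_{h' \neq h} \sum_{q} L_{\ell h'}(X_q)\, A(X_q), \tag{17}
$$

and

$$
L_{\ell h}(X_q) = \prod_{j=1}^{n_h}
F_j(X_q)^{x_{\ell j}} \left(1 - F_j(X_q)\right)^{1 - x_{\ell j}}.
$$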

It is interesting to note that $\bar{r}_j$ and $\bar{N}_h$ now only contain
information from the specific subset of items ($h$) for which item $j$ is a
member. This is, of course, due to the independence between the subsets that
results from the simple structure.
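The reduction to a product of unidimensional probabilities can be verified numerically. The Python sketch below is an illustration only: the five-point rule, the parameter values, and the function names are hypothetical. It evaluates a two-dimensional bi-factor quadrature with all primary loadings set to zero and compares it with the product of one-dimensional quadratures.

```python
import math


def norm_cdf(t):
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def gauss_hermite_5():
    """Five-point Gauss-Hermite rule for a standard normal weight function."""
    r10 = math.sqrt(10.0)
    inner, outer = math.sqrt(5.0 - r10), math.sqrt(5.0 + r10)
    w_in, w_out = (7.0 + 2.0 * r10) / 60.0, (7.0 - 2.0 * r10) / 60.0
    return ([-outer, -inner, 0.0, inner, outer],
            [w_out, w_in, 8.0 / 15.0, w_in, w_out])


def group_marginal(x, members, items, x1, nodes, weights):
    """Likelihood of one subsection summed over its group-factor nodes,
    holding the primary dimension at x1."""
    total = 0.0
    for xh, wh in zip(nodes, weights):
        lik = 1.0
        for j in members:
            gamma, a1, ah = items[j]
            z = -(gamma - a1 * x1 - ah * xh)
            p = norm_cdf(z / math.sqrt(1.0 - a1 * a1 - ah * ah))
            lik *= p if x[j] == 1 else 1.0 - p
        total += lik * wh
    return total


def bifactor_prob(x, items, groups):
    """Two-dimensional form: outer sum over primary nodes."""
    nodes, weights = gauss_hermite_5()
    return sum(
        w1 * math.prod(group_marginal(x, m, items, x1, nodes, weights)
                       for m in groups)
        for x1, w1 in zip(nodes, weights))


def simple_structure_prob(x, items, groups):
    """Unidimensional form: product of independent subsection terms."""
    nodes, weights = gauss_hermite_5()
    return math.prod(group_marginal(x, m, items, 0.0, nodes, weights)
                     for m in groups)


# Hypothetical items with zero primary loading: (gamma, 0, group loading).
items = [(-0.2, 0.0, 0.5), (0.1, 0.0, 0.4), (0.3, 0.0, 0.6), (-0.4, 0.0, 0.7)]
groups = [[0, 1], [2, 3]]
x = (1, 0, 1, 1)
print(bifactor_prob(x, items, groups))
print(simple_structure_prob(x, items, groups))
```

With zero primary loadings the outer sum only adds up the primary weights, so the two values agree to rounding error.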

Application of the simple structure model to the ACT natural science test
example yields the item parameters displayed in Table 3. Inspection of the
parameter estimates in Table 3 reveals that removal of the primary factor
increases the magnitude of the loadings on the individual paragraph
dimensions. In terms of model fit, both the bi-factor model ($\chi^2 = 336$,
$p < .0001$) and the unrestricted four-factor model ($\chi^2 = 361$,
$p < .0001$) provide significant improvements in fit over the simple
structure model, indicating that the test is in fact measuring a primary
ability dimension and not merely four independent realms of knowledge.
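A chi-square difference of this size can be converted to a tail probability with a few lines of Python. The sketch below uses the closed form that holds for an even number of degrees of freedom; the function name is hypothetical, and the value of 20 degrees of freedom for the bi-factor comparison (one per freed primary loading) is an assumption, not a figure stated in the text.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variate with even df:
    exp(-x/2) * sum over k < df/2 of (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Bi-factor versus simple structure, assuming 20 degrees of freedom.
print(chi2_sf_even(336.0, 20))
# Sanity check: with 2 df the tail probability is exp(-x/2).
print(chi2_sf_even(2.0 * math.log(2.0), 2))
```

Under that assumption the tail probability is many orders of magnitude below .0001, in line with the reported significance.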



6 Discussion 

The bi-factor model presented here provides a natural alternative to the
traditional conditionally-independent unidimensional IRT model. When
potential sources of conditional dependence are known in advance, as in the
case of paragraph comprehension tests or tests in which two or more methods
of item presentation are involved, the item bi-factor solution provides an
excellent alternative. An attractive by-product of this model is that it
requires only the evaluation of a two-dimensional integral, regardless of the
number of potential subtests, paragraphs, or content areas. These different
content areas are, of course, assumed to be independent conditional on the
primary ability dimension that the test was designed to measure. As such, the
limitations on the dimensionality of the full-information item factor
analysis model embodied in the TESTFACT program (Wilson, Wood & Gibbons,
1984) do not apply. Of course, the subsections (e.g., paragraphs) must be
known in advance.

In certain situations, for example psychiatric measurement (Gibbons, 1985),
the existence of a primary dimension (e.g., depression) is itself in
question. In this case, comparison of the bi-factor and simple structure
solutions presented here is of particular interest. Item bi-factor analysis
could therefore help answer the question of whether depression is a unitary
disorder or a mixture of a series of qualitatively distinct abnormalities, a
question that has long plagued psychiatric researchers. Comparison of the fit
of the bi-factor and simple structure models provides a tool for
investigating such problems in psychiatric research and other areas as well.

Finally, in those cases in which little is known about the structure of a
particular test, but little confidence can be placed in the assumption of
conditional independence, the more general solution presented by Gibbons et
al. (1989), using Clark's (1961) formulae for the moments of $n$ jointly
normal variables, could be used. This procedure uses a direct approximation
to the multivariate normal distribution that underlies the item-response
function, without restrictions on the form of the inter-item residual
covariances. With it, the assumption of conditional independence is not
required. Further work in this area is underway.
Table 1

Full-Information Item Factor Analysis - Unrestricted Promax Solution
ACT Natural Science Test - 20 items and 1000 subjects

Item  $\gamma_j$     $\alpha_{j1}$  $\alpha_{j2}$  $\alpha_{j3}$  $\alpha_{j4}$
 1    -.215          .401           -.005          [illegible]    [illegible]
 2    -.385          .185           -.019          [illegible]    [illegible]
 3    -.356          .667           -.070          [illegible]    [illegible]
 4    -.098          [illegible]    .013           [illegible]    .009
 5    -.029          .562           [illegible]    [illegible]    [illegible]
 6    -.582          .129           .068           [illegible]    [illegible]
 7    -.585          .184           [illegible]    .410           [illegible]
 8    -.137          -.037          -.061          [illegible]    [illegible]
 9    -.246          .238           .063           [illegible]    [illegible]
10    -.089          -.224          .128           .620           .060
11    -.049          [illegible]    .135           -.034          .311
12    -.407          -.024          -.065          .124           .320
13    -.265          .247           .082           .020           .173
14    -.051          .137           .005           .007           .585
15    .040           .224           .129           -.045          .295
16    .345           .153           .289           -.122          -.109
17    .167           -.007          .682           .089           -.044
18    .172           -.096          .520           -.024          .120
19    .543           .008           .500           .067           .091
20    .672           -.073          -.010          .004           .163
Table 2

Full-Information Item Bi-Factor Analysis
ACT Natural Science Test - 20 items and 1000 subjects

Item  $\gamma_j$     $\alpha_{j1}$  $\alpha_{j2}$  $\alpha_{j3}$  $\alpha_{j4}$  $\alpha_{j5}$
 1    -.230          .524           .129
 2    -.392          .232           .115
 3    -.370          .411           .427
 4    -.118          .548           .278
 5    -.046          .489           .338
 6    -.593          .311                          .277
 7    -.600          .376                          .314
 8    -.138          .087                          -.019
 9    -.259          .207                          .390
10    -.103          .226                          .476
11    -.062          .484                                         .141
12    -.413          .261                                         .135
13    -.277          .423                                         .199
14    -.066          .573                                         .187
15    .025           .492                                         .260
16    .340           .112                                                        .261
17    .150           .306                                                        .662
18    .160           .240                                                        .571
19    .528           .340                                                        .493
20    .671           .061                                                        .031
Table 3

Full-Information Simple Structure Item Factor Analysis
ACT Natural Science Test - 20 items and 1000 subjects

Item  $\gamma_j$     $\alpha_{j1}$  $\alpha_{j2}$  $\alpha_{j3}$  $\alpha_{j4}$
 1    -.224          .482
 2    -.391          .251
 3    -.368          .571
 4    -.111          .612
 5    -.040          .585
 6    -.592                         .408
 7    -.597                         .467
 8    -.138                         .032
 9    [illegible]                   .429
10    -.102                         .509
11    [illegible]                                  .489
12    -.412                                        .297
13    -.273                                        .449
14    -.058                                        .591
15    .031                                         .566
16    .341                                                        .282
17    .157                                                        .732
18    .163                                                        .616